Experimental design: Participants switched tasks every 4 trials in an alternating-runs design (no cues). A total of 8 experimental blocks were intended, but 60 subjects saw only 7 blocks. Four conditions were counterbalanced across these blocks: both tasks unambiguous (1); both tasks ambiguous (4); shape ambiguous when irrelevant, color always ambiguous (2); shape always ambiguous, color unambiguous when irrelevant (2).
What the variables mean:
block: 0 for practice; there are 8 experimental blocks per participant and each block consists of 112 trials. That gives 896 trials per participant over all 8 blocks, or 784 over 7 blocks for participants who saw only 7 (not counting practice)
bal: counterbalancing of conditions across blocks
x, y, c2: irrelevant (already taken out)
cycle: counting within full alternating cycle (8), switch at 1 and 5
task: 1=shape, 2=color
dimshape: specific shapes; 4 = neutral
dimcolor: specific colors; 4 = neutral
correct: correct response (i.e., value of the currently relevant task dimension)
error: 0 = no error, 1 = yes error
response: actual response
RT: response time (ms)
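The cycle coding described above can be illustrated with a short sketch. It assumes that trials within a block are numbered consecutively from 1 and that the cycle restarts every 8 trials. The source does not say which task occupies positions 1 to 4, so the labels "A" and "B" are placeholders.

```r
# Hypothetical trial numbers within one block (112 trials per block, per the text)
trial <- 1:112

# Position within the 8-trial alternating-runs cycle (1..8)
cycle <- (trial - 1) %% 8 + 1

# Runs of 4: positions 1-4 belong to one task, positions 5-8 to the other.
# "A" and "B" are placeholders; the source does not state which task comes first.
run_task <- ifelse(cycle <= 4, "A", "B")

# A switch is coded on the first trial of each run (cycle positions 1 and 5)
is_switch <- cycle %in% c(1, 5)

table(is_switch)  # 28 switch-coded trials and 84 other trials per 112-trial block
```

Under this coding, one quarter of the trials in each block are switch trials.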
Import data-set and some packages
# open necessary packages here
library(tidyverse)
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr 1.1.4 ✔ readr 2.1.5
✔ forcats 1.0.0 ✔ stringr 1.5.1
✔ ggplot2 3.5.1 ✔ tibble 3.2.1
✔ lubridate 1.9.3 ✔ tidyr 1.3.1
✔ purrr 1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
library(janitor)
Warning: package 'janitor' was built under R version 4.4.1
Attaching package: 'janitor'
The following objects are masked from 'package:stats':
chisq.test, fisher.test
library(readr)
library(rio)
library(psych) # generates a matrix with scatterplots and correlations
Attaching package: 'psych'
The following objects are masked from 'package:ggplot2':
%+%, alpha
Rows: 94754 Columns: 16
── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
dbl (16): id, bal, block, x, cond, trial, y, c2, cycle, task, dim1, dim2, co...
ℹ Use `spec()` to retrieve the full column specification for this data.
ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# Use pivot_longer() and/or pivot_wider() here with some key variables we want to look at.
# May need to alter the df so that some 1s and 0s in columns become names
# (correct/incorrect, or color/shape). Fix code below...
# alt_run %>%
#   pivot_wider(names_from = task, values_from = block)
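One way the recoding mentioned in the comment might look is sketched below. The toy data and the names `toy`, `toy_labeled`, `task_label`, and `error_label` are hypothetical; only the code meanings (task 1 = shape, 2 = color; error 0 = no error, 1 = error) come from the variable list.

```r
library(dplyr)

# Toy data standing in for alt_run (values are made up for illustration)
toy <- tibble(
  task  = c(1, 2, 2, 1),
  error = c(0, 0, 1, 0)
)

# Label the numeric codes using the codebook:
# task 1 = shape, 2 = color; error 0 = no error, 1 = error
toy_labeled <- toy %>%
  mutate(
    task_label  = factor(task,  levels = c(1, 2), labels = c("shape", "color")),
    error_label = factor(error, levels = c(0, 1), labels = c("correct", "incorrect"))
  )

toy_labeled
```

Keeping the original numeric columns alongside the labeled ones means later numeric summaries (for example, `mean(error)`) still work.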
Determine and Remove Outliers (Error way…)
# We are testing for accuracy, so each participant needs at least 80% accuracy across all trials
# Determine the 80% accuracy criterion
crit <- 896 - (896 * .8)
crit # 179.2: at most 179 of 896 trials may be errors (error == 1), i.e., at least 717 must be correct (error == 0)
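Because some participants completed 784 rather than 896 trials, a proportion-based criterion may be easier to apply than a fixed count. The sketch below is one possible approach, not the analysis actually run here; the toy data and the names `accuracy`, `keep_ids`, and `toy_clean` are hypothetical.

```r
library(dplyr)

# Toy data: two hypothetical participants with 10 experimental trials each
toy <- tibble(
  id    = rep(c(1, 2), each = 10),
  block = 1,
  error = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 1,   # id 1: 90% correct
            1, 1, 1, 0, 0, 0, 0, 0, 0, 0)   # id 2: 70% correct
)

# Proportion correct per participant, excluding practice (block 0)
accuracy <- toy %>%
  filter(block != 0) %>%
  group_by(id) %>%
  summarise(n_trials = n(), prop_correct = mean(error == 0), .groups = "drop")

# Keep only participants at or above the 80% criterion
keep_ids <- accuracy %>%
  filter(prop_correct >= 0.80) %>%
  pull(id)

toy_clean <- toy %>%
  filter(id %in% keep_ids)
```

Working with a proportion applies the same 80% rule regardless of how many blocks a participant completed.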
# WHAT TO DO: z-score on each seq position x switch x ambiguity on RT, then z-score on each block (to account for some participants only doing 7 instead of 8 blocks)
# STEP 1: separate switch trials, c(1,5), and control trials, !c(1,5)
alt_run <- alt_run %>%
  mutate(trial_type = if_else(cycle %in% c(1, 5), 'switch', 'control'))
alt_run
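A minimal sketch of z-scoring RT within trial type follows. It groups only by `trial_type`; the plan above also mentions sequence position and ambiguity, which could be added to `group_by()` once the relevant columns are identified. The toy data and the names `z_sketch` and `z_rt` are hypothetical.

```r
library(dplyr)

# Toy data: two full cycles with made-up RTs (slower at positions 1 and 5)
toy <- tibble(
  cycle = rep(1:8, times = 2),
  RT    = c(900, 600, 580, 570, 880, 610, 590, 560,
            950, 620, 600, 540, 870, 630, 570, 580)
) %>%
  mutate(trial_type = if_else(cycle %in% c(1, 5), "switch", "control"))

# z-score RT separately within each trial type
z_sketch <- toy %>%
  group_by(trial_type) %>%
  mutate(z_rt = (RT - mean(RT, na.rm = TRUE)) / sd(RT, na.rm = TRUE)) %>%
  ungroup()

# Within each group the z-scores have mean 0 and SD 1 by construction
z_sketch %>%
  group_by(trial_type) %>%
  summarise(mean_rt = mean(RT), mean_z = mean(z_rt), sd_z = sd(z_rt))
```

Because the standardization is done per group, a mean z of 0 and an SD of 1 in every group confirms the computation rather than telling us anything about the data; the group difference shows up in the raw mean RTs.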
# STEP 2.1: Interpret the data
# The mean z-score for both control and switch trials is basically 0 (which is what we want to see) and the z SD is 1 (which is also what we want to see).
# Comparing mean RT for switch and control trials, response times on switch trials are on average a lot longer than on control (non-switch) trials. This is so cool!
rtdif <- z_scoretrial %>%
  summarize(meandif = 889.9560 - 579.1123)
rtdif # difference in means of 310.8437: switch trials take about 310.84 ms longer than control (non-switch) trials
# A tibble: 1 × 1
meandif
<dbl>
1 311.
# STEP 2: Look at mean RTs by trial_type, look at z-scores
# STEP 4: Start steps for z-score on each block
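The per-block z-scoring step could be sketched as follows. This is an illustration under assumptions, not the analysis itself: the toy data are made up, the names `z_block`, `z_rt_block`, and `outlier` are hypothetical, and the cutoff of |z| > 3 is an assumed convention that does not appear in the source.

```r
library(dplyr)

# Toy data: 2 hypothetical participants x 2 blocks x 20 trials, deterministic RTs
toy <- tibble(
  id    = rep(1:2, each = 40),
  block = rep(rep(1:2, each = 20), times = 2),
  RT    = 600 + (seq_len(80) %% 7) * 20
)
toy$RT[5] <- 3000  # one implausibly slow response, for illustration

# z-score RT within each participant x block, so that participants with
# 7 blocks and participants with 8 blocks are standardized the same way
z_block <- toy %>%
  group_by(id, block) %>%
  mutate(z_rt_block = (RT - mean(RT, na.rm = TRUE)) / sd(RT, na.rm = TRUE)) %>%
  ungroup() %>%
  mutate(outlier = abs(z_rt_block) > 3)  # assumed cutoff, not from the source

# Rows flagged as outliers
z_block %>% filter(outlier)
```

Standardizing within block means each block contributes on the same scale, whatever the number of blocks a participant saw.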
Descriptive Graphs
1. Histogram of RT
mean_rt <- mean(alt_run$RT, na.rm = TRUE)
mean_rt
[1] 860.4441
sd_rt <- sd(alt_run$RT, na.rm = TRUE)
alt_run %>%
  ggplot(aes(x = RT)) +
  geom_histogram(aes(y = after_stat(density)), fill = 'darkgreen', color = 'darkblue') +
  # linewidth replaces the size aesthetic for lines, which was deprecated in ggplot2 3.4.0
  geom_vline(aes(xintercept = mean_rt), color = 'red', linetype = 'dashed', linewidth = 1.5) +
  theme_minimal() +
  stat_function(fun = dnorm, args = list(mean = mean_rt, sd = sd_rt), color = 'gold', linewidth = 1.5) +
  labs(x = 'Response Times (ms)', y = 'Density',
       title = 'Density plot of Response Times',
       subtitle = 'The mean and normal density curve of RTs')
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
plotly::ggplotly()
`stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
2. Boxplot of RT
# boxplot of all RTs regardless of task
boxplot(alt_run_1$RT)
# boxplot of RTs when doing the shape task
boxplot_s <- filter(alt_run_1, task == 'shape')
boxplot(boxplot_s$RT)
# boxplot of RTs when doing the color task
boxplot_c <- filter(alt_run_1, task == 'color')
boxplot(boxplot_c$RT)
3. Correlations
cor_alt <- alt_run %>%
  select(RT, cycle, task)
cor(cor_alt, use = "complete.obs")
# Is there a correlation between response times and error rate?
# Note: pairs.panels() comes from the psych package
alt_run %>%
  select(RT, error) %>%
  pairs.panels(lm = TRUE)
5. Scatterplots in select
What is the relationship between error and response time?
How does position within the cycle relate to response time?
What are the dynamics of switching tasks? (cycle position 5 or 1)
Is there a difference in response time when people switch from one task to another?
# Relationship between error and response time
# error takes only two values: 0 = no error, 1 = error
# so the points fall along two horizontal lines, one per error value
alt_run %>%
  ggplot(aes(RT, error)) +
  geom_point()
# Position in cycle and its relationship with response time
# Plotting RT against cycle also produces a scatterplot;
# the points fall on straight lines (one per cycle position) and it is not yet clear what this says about the data
alt_run %>%
  ggplot(aes(RT, cycle)) +
  geom_point()
# Position of cycle in relation to response time?
alt_run %>%
  ggplot(aes(cycle, RT)) +
  geom_point()
# Task and response time
alt_select %>%
  ggplot(aes(RT, task)) +
  geom_point()
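Since cycle position and task are discrete, raw scatterplots stack points on a few lines, and a per-position summary may answer the questions above more directly. The following is a sketch on made-up data; the names `toy` and `rt_by_cycle` are hypothetical.

```r
library(dplyr)

# Toy data: three full cycles with made-up RTs (slower at positions 1 and 5)
toy <- tibble(
  cycle = rep(1:8, times = 3),
  RT    = c(900, 600, 580, 570, 880, 610, 590, 560,
            950, 620, 600, 540, 870, 630, 570, 580,
            910, 590, 610, 560, 890, 600, 580, 570)
)

# Mean RT and trial count at each position in the 8-trial cycle
rt_by_cycle <- toy %>%
  group_by(cycle) %>%
  summarise(mean_rt = mean(RT, na.rm = TRUE), n = n(), .groups = "drop")

rt_by_cycle
```

A table like this can be passed to `ggplot()` with `geom_point()` and `geom_line()` to show how RT changes across the cycle, with any switch cost visible at positions 1 and 5.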
7. Pivoting
This code pivots the data so that each trial type has its own column (“switch” and “control”), and each column holds the response times for that trial type.
alt_run_wide <- alt_run %>%
  pivot_wider(
    names_from = trial_type,   # Use trial_type as column names
    values_from = RT,          # Fill columns with response_time values
    names_prefix = "response_time_"
  )
Warning: Values from `RT` are not uniquely identified; output will contain list-cols.
• Use `values_fn = list` to suppress this warning.
• Use `values_fn = {summary_fun}` to summarise duplicates.
• Use the following dplyr code to identify duplicates.
{data} |>
dplyr::summarise(n = dplyr::n(), .by = c(id, bal, block, x, cond, trial, y,
c2, cycle, task, dimshape, dimcolor, correct, error, response, trial_type))
|>
dplyr::filter(n > 1L)
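The warning indicates that several RT values map to the same output cell, so the pivoted columns become list-columns. One possible way around this, sketched here on made-up data, is to summarise to one value per participant and trial type before pivoting. The choice of the mean as the summary and the names `toy` and `toy_wide` are assumptions for illustration.

```r
library(dplyr)
library(tidyr)

# Toy data: two hypothetical participants with several trials per trial type
toy <- tibble(
  id         = c(1, 1, 1, 1, 2, 2, 2, 2),
  trial_type = c("switch", "control", "control", "control",
                 "switch", "control", "control", "control"),
  RT         = c(900, 600, 580, 620, 850, 640, 600, 590)
)

# Reduce to one mean RT per participant x trial type, then pivot.
# Each output cell now receives exactly one value, so no list-columns arise.
toy_wide <- toy %>%
  group_by(id, trial_type) %>%
  summarise(mean_rt = mean(RT, na.rm = TRUE), .groups = "drop") %>%
  pivot_wider(
    names_from   = trial_type,
    values_from  = mean_rt,
    names_prefix = "response_time_"
  )

toy_wide
```

The result has one row per participant with plain numeric columns `response_time_control` and `response_time_switch`.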
8. Descriptive table
The descriptive table below shows the mean, median, and standard deviation of participants’ response times in the control versus switch trials.
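library(tidyr)
# Converting the columns to numeric for the descriptives table
# Note: the pivot above warned that the output contains list-cols, so this coercion
# returns NA for any cell that does not hold exactly one value
alt_run_wide$response_time_control <- as.numeric(as.character(alt_run_wide$response_time_control))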
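The same descriptives can also be computed directly from long-format data, which avoids the pivot and the numeric conversion. The sketch below uses made-up values; the names `toy` and `desc_table` are hypothetical.

```r
library(dplyr)

# Toy long-format data: one row per trial, made-up RTs
toy <- tibble(
  trial_type = c("switch", "switch", "switch", "control", "control", "control"),
  RT         = c(900, 850, 950, 600, 580, 620)
)

# Mean, median, and standard deviation of RT for each trial type
desc_table <- toy %>%
  group_by(trial_type) %>%
  summarise(
    mean_rt   = mean(RT, na.rm = TRUE),
    median_rt = median(RT, na.rm = TRUE),
    sd_rt     = sd(RT, na.rm = TRUE),
    n         = n(),
    .groups   = "drop"
  )

desc_table
```

Grouping by `trial_type` yields one row for control and one for switch, matching the layout the descriptive table is meant to have.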